ERANNs: Efficient residual audio neural networks for audio pattern recognition
Abstract
Audio pattern recognition (APR) is an important research topic and can be applied to several fields related to our lives. Therefore, accurate and efficient APR systems need to be developed, as they are useful in real applications. In this paper, we propose a new convolutional neural network (CNN) architecture and a method for improving the inference speed of CNN-based systems for APR tasks. Moreover, using the proposed method, we can improve the performance of our systems, as confirmed in experiments conducted on four audio datasets. In addition, we investigate the impact of data augmentation techniques and transfer learning on the performance of our systems. Our best system achieves a mean average precision (mAP) of 0.450 on the AudioSet dataset. Although this value is less than that of the state-of-the-art system, the proposed system is 7.1x faster and 9.7x smaller. On the ESC-50, UrbanSound8K, and RAVDESS datasets, we obtain state-of-the-art results with accuracies of 0.961, 0.908, and 0.748, respectively. Our system for the ESC-50 dataset is 1.7x faster and 2.3x smaller than the previous best system. For the RAVDESS dataset, our system is 3.3x smaller than the previous best system. We name our systems "Efficient Residual Audio Neural Networks".
Similar resources
Audio Chord Recognition with Recurrent Neural Networks
In this paper, we present an audio chord recognition system based on a recurrent neural network. The audio features are obtained from a deep neural network optimized with a combination of chromagram targets and chord information, and aggregated over different time scales. Contrary to other existing approaches, our system incorporates acoustic and musicological models under a single training o...
Efficient Neural Audio Synthesis
Sequential models achieve state-of-the-art results in audio, visual and textual domains with respect to both estimating the data distribution and generating high-quality samples. Efficient sampling for this class of models has however remained an elusive problem. With a focus on text-to-speech synthesis, we describe a set of general techniques for reducing sampling time while maintaining high o...
Precision Scaling of Neural Networks for Efficient Audio Processing
While deep neural networks have shown powerful performance in many audio applications, their large computation and memory demand has been a challenge for real-time processing. In this paper, we study the impact of scaling the precision of neural networks on the performance of two common audio processing tasks, namely, voice-activity detection and single-channel speech enhancement. We determine ...
ANN Paradigms for Audio Pattern Recognition
Pattern recognition is the process of classifying data or patterns based on either a priori knowledge or statistical information extracted from the patterns. An audio pattern recognition problem is based on spoken speech patterns, which can be interpreted as speaker dependent or speaker independent. An Artificial Neural Network (ANN) is an information-processing machine learning model, inspired by bi...
Audio Visual Speech Recognition Using Deep Recurrent Neural Networks
In this work, we propose a training algorithm for an audiovisual automatic speech recognition (AV-ASR) system using a deep recurrent neural network (RNN). First, we train a deep RNN acoustic model with a Connectionist Temporal Classification (CTC) objective function. The frame labels obtained from the acoustic model are then used to perform a non-linear dimensionality reduction of the visual featu...
Journal
Journal title: Pattern Recognition Letters
Year: 2022
ISSN: 1872-7344, 0167-8655
DOI: https://doi.org/10.1016/j.patrec.2022.07.012